JMIRx Med
JMIR Publications Inc.
All preprints, ranked by how well they match JMIRx Med's content profile, based on 31 papers previously published here. The average preprint has a 0.06% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Dabestani, A.; Bazil, C. W.; Costantino, R. C.; Fox, E.; Graedon, J.; Lever, H.; Makuch, R.; White, C. M.
The quality of drug products in the United States, which are largely produced overseas, has been a matter of growing concern [1]. Buyers and payers of pharmaceuticals, whether they are health systems, insurers, PBMs, pharmacies, physicians, or patients, have little to no visibility into any quality metrics for the manufacturers of drug products or the products themselves. A system of "quality scores" is proposed to enable health systems and other purchasers and payers of medication to differentiate among drug products according to evidence-based metrics. Metrics influencing the quality scores described herein include both broadly applicable regulatory information and more drug-specific, third-party chemical analysis information. The aggregation of these metrics through a proposed set of rules results in numerical values on a 0-100 scale that may be further simplified into a red/yellow/green designation. The simplicity of such scores enables seamless integration into existing healthcare systems, and an integration scheme is proposed. Using real-world data from currently on-market valsartan drug products, this proposed system generated a variety of quality scores for six major manufacturers. These scores were further evaluated against their current market price, showing no significant correlation between quality score and price. The implementation of drug quality scores at healthcare institutions in the United States, and their potential utilization by regulators, could create a much-needed, market-driven incentive for pharmaceutical manufacturers to produce quality medications that would reduce drug shortages and improve public health.
Dey, S.
A time-series model was developed for the Number of Total Infected Cases in India, using data from Mar 3 to May 7, 2020. Two models developed in the early phases were discarded when they lost statistical validity. The third, current, model is a 3rd-degree polynomial that has remained stable over the last 30 days (since Apr 8), with R² > 0.998 consistently. This model is used to forecast Total Covid cases, after a cautionary discussion of triggers that would invalidate the model. The purpose of all forecasts in the study is to provide a comparator to evaluate policy initiatives to control the pandemic - the forecasts are not objectives by themselves. Actual observations lower than forecasts indicate successful policy interventions. Figures of Doubling Time, Fatality Rate and Recovery Rate used by authorities are questioned. Elongation of the doubling time is inherent in the model, and worthy of mention only when the time actually exceeds what the model predicts. The popular Fatality Rate and Recovery Rate metrics are shown to be illogical. The study defines two terms, Ongoing Fatality Rate (OFR) and Ongoing Recovery Rate (ORR), and determines these currently to be ~9% and ~76% respectively in India. Over time, OFR will decline to the eventual Case Fatality Rate (CFR), while ORR will eventually climb to (1-CFR). There is no statistical basis to assume an eventual Indian CFR, and China's 5.5% CFR is used as a proxy. Using these metrics, the current model forecasts, by May-end, >150000 Total Infected, ~5000 Deaths and >85000 Active Cases. There is no pull-back evident in the current model in the foreseeable future, and cases continue to rise at progressively slower rates. Subject to the usual caveats, the model is used to forecast till Sep 15. The study argues that Indian hospital infrastructure is reasonably ready to handle Active Cases as predicted for Sept 15 - in that sense, the curve is "flat enough".
However, the curve is NOT flat enough with respect to fatalities - nearly 100000 by Sept 15. Setting an arbitrary limit that Total Deaths must be within 50,000 by Sept 15, the study retrofits a model that shows what the desired growth of Covid19 cases should be. It is seen that an overall doubling time of 38 days is required in the period June 1 to Sep 15 if deaths are to remain below 50000.
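The doubling-time figure quoted above can be related to case counts with a short calculation. The sketch below is not taken from the study: it assumes plain exponential growth between two cumulative counts (the study itself fits a cubic polynomial), and the function name and sample numbers are hypothetical.

```python
import math

def doubling_time(count_start, count_end, days_elapsed):
    """Days needed for a count to double, assuming exponential growth
    between the two observations (an assumption, not the study's model)."""
    if count_start <= 0 or count_end <= count_start or days_elapsed <= 0:
        raise ValueError("need 0 < count_start < count_end and days_elapsed > 0")
    growth = math.log(count_end / count_start)  # total log-growth over the window
    return days_elapsed * math.log(2) / growth

# Hypothetical counts: a total that doubles over 38 days has a 38-day doubling time.
print(round(doubling_time(100_000, 200_000, 38), 1))
```

A longer doubling time for the same window means slower growth, which is why the abstract frames its mortality target as a minimum doubling time.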
Sengupta, S.; Mugde, S.; Sharma, G.
Twitter is one of the world's biggest social media platforms, hosting an abundant number of user-generated posts. It is considered a gold mine of data. The majority of tweets are public and therefore pullable, unlike on other social media platforms. In this paper we analyze the topics related to mental health that were recently (June 2020) discussed on Twitter. Also, amidst the ongoing pandemic, we examine whether Covid-19 emerges as one of the factors impacting mental health. Further, we perform an overall sentiment analysis to better understand the emotions of users. Executive Summary: The novel coronavirus's spread and its impact on various aspects of national and individual well-being have been at the center of many discussions across micro-blogging sites and various social media platforms ever since it commenced in December 2019. Users are voicing their opinions on several topics related to Covid-19, including social distancing as prescribed by government and local administration. We are all aware that the novel coronavirus has significantly affected our physical health; however, the current social distancing norms are taking a toll on the psychological well-being of individuals. The research paper presents a two-phased analysis of the most recent 2000 tweets related to mental health, pulled twice over a span of one month, on 28 June 2020 and 28 July 2020 respectively, thereby analyzing 4000 tweets in total. The second-phase analysis was conducted exactly one month later to validate the results generated by the first analysis. The intention is to analyze to what extent people have discussed mental health in the past few months, based on the information disseminated on Twitter. Data was extracted using Twitter's search application programming interface (API) and Python's tweepy library. A predefined keyword such as "mental health" was given to find out if Covid-19 emerges as a reason for the same.
Several natural language processing (NLP) techniques, such as tokenization, removal of URLs and stop words, stemming and lemmatization, were used to pre-process the text data and make it ready for analysis. The collected tweets were analyzed using word frequencies of single and double words (unigrams, bigrams). A distinctive feature of this analysis is a network diagram that shows interconnections between the set of most common words used in tweets; the connections (if any) are represented through links. A topic modeling technique in NLP visualizes the top concerns of tweeters through a word cloud. At present there are many methods for topic modeling. In this paper we use the Latent Dirichlet Allocation (LDA) method, a probabilistic modeling approach introduced by David M. Blei and colleagues in 2003. This model deals with the distribution of topics to tweets, the allocation of those topics to documents, and of words to topics. Finally, a sentiment analysis is done using text mining techniques to classify the sentiment of the tweets as positive, negative or neutral.
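The pre-processing and word-frequency steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the stop-word list, the regular expressions, the sample tweets and the function names are all hypothetical, and stemming and lemmatization are left out to keep the example within the standard library.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (hypothetical); a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to", "in", "on", "my", "for", "with", "this"}

def preprocess(tweet):
    """Lowercase the text, strip URLs, tokenize, and drop stop words."""
    text = re.sub(r"https?://\S+", " ", tweet.lower())
    tokens = re.findall(r"[a-z0-9']+", text)
    return [t for t in tokens if t not in STOP_WORDS]

def ngram_counts(tweets, n):
    """Count n-grams (n=1 for unigrams, n=2 for bigrams) across all tweets.
    N-grams are formed after stop-word removal, so they may span removed words."""
    counts = Counter()
    for tweet in tweets:
        tokens = preprocess(tweet)
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts

# Made-up sample tweets, for illustration only.
tweets = [
    "Mental health matters during lockdown https://example.com/a",
    "Lockdown is hard on my mental health",
]
print(ngram_counts(tweets, 1).most_common(3))
print(ngram_counts(tweets, 2).most_common(1))
```

Keys are tuples of tokens, so unigram and bigram counts share one data structure and can feed the same plotting or network-building step.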
Sengupta, S.; Mugde, S.; Sharma, G.
India reported its first Covid-19 case on 30th Jan 2020, and the number of reported cases escalated sharply from March 2020. This research paper analyses COVID-19 data initially at a global level and then drills down to the scenario in India. Data is gathered from multiple sources, namely several authentic government websites. The need of the hour is to accurately forecast when the numbers will reach their peak and then diminish. This will be of great help to public welfare professionals in planning preventive measures while maintaining the economic balance of the country. Variables such as gender, geographical location, age, etc. have been represented using Python and data visualization techniques. Time series forecasting techniques, including machine learning models such as Linear Regression, Support Vector Regression and Polynomial Regression, and deep learning forecasting models such as LSTM (long short-term memory), are deployed to study the probable hike in cases in the near future. A comparative analysis is also done to understand which model fits our data best. Data is considered till 30th July, 2020. The results show that a statistical model, the sigmoid model, outperforms the other models. The sigmoid model also gives an estimate of the day on which the number of active cases can be expected to reach its peak, and of when the curve will start to flatten. The strength of the sigmoid model lies in providing a date estimate that no other model offers, and thus it is the best model to predict Covid case counts - this is a unique feature of the analysis in this paper. Certain feature engineering techniques have been used to transform data to a logarithmic scale, as it affords better comparison by removing data extremities or outliers. Based on the predictions for the short-term interval, our model can be tuned to forecast long time intervals.
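To show why a sigmoid curve yields a peak date, here is a minimal sketch. The abstract does not give the functional form, so a common three-parameter logistic curve for cumulative cases is assumed; the parameter values are hypothetical and not fitted to any data. Under this form the peak is in daily new cases, which is not necessarily the same day as the peak in active cases.

```python
import math

def logistic_cumulative(t, capacity, rate, midpoint):
    """Cumulative cases at day t under a logistic (sigmoid) curve."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))

def logistic_daily(t, capacity, rate, midpoint):
    """Daily new cases: the derivative of the cumulative curve."""
    c = logistic_cumulative(t, capacity, rate, midpoint)
    return rate * c * (1.0 - c / capacity)

# Hypothetical parameters, not fitted to any data.
capacity, rate, midpoint = 1_000_000, 0.08, 120

# Daily new cases peak at the midpoint, where the cumulative total is half the capacity.
peak_day = max(range(0, 301), key=lambda d: logistic_daily(d, capacity, rate, midpoint))
print(peak_day, logistic_cumulative(midpoint, capacity, rate, midpoint))
```

Once the three parameters are fitted, the midpoint is read off directly as a date, which is the property the abstract highlights.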
Subramaniyan, B.; Mahapatra, P.; Mohamud, M.; Naqvi, A.
This research evaluates the usability of Pathpoint® Outcomes Software, aligning with IEC 62366-1:2015 standards to ensure rigour and accuracy. The study involved diverse user groups, such as Healthcare Professionals and the General Population, to provide varied perspectives. The study focused on the software's usability, safety, and effectiveness, which are crucial to user experience, satisfaction, and performance. Any potential risks to users or patient data were identified to ensure safety standards. The software's effectiveness was assessed, considering its accuracy, reliability, and efficiency. The study used both quantitative metrics and qualitative feedback for a balanced evaluation. It also aimed to identify and mitigate potential use errors, enhancing the software's usability, safety, and effectiveness.
Singh, A.; Gupte, S. S.
Covid-19, just like SARS and MERS before it, is a disease caused by a coronavirus and can lead to severe respiratory disease in humans. With the outbreak of the novel coronavirus, WHO declared it a Public Health Emergency on 30th January 2020, and on 11th March 2020 Covid-19 was declared a pandemic. In the initial stages of the pandemic, India dealt with it in a very effective manner. With timely implementation of lockdown, India was able to contain the spread of Covid-19 to some extent. However, with the recently announced Unlock 1.0, SARS-CoV-2 is expected to spread. This study aims to track and analyze the Covid-19 situation in the major states that account for 70 percent of the total cases. The states selected for the study are Maharashtra, Delhi, Tamil Nadu, Gujarat, Uttar Pradesh and Rajasthan. These are the states that had more than ten thousand Covid-19 patients as of June 10th 2020. The analysis period is from March 25th to June 10th, and the data source is India's Covid-19 tracker. To assess the previous and current Covid-19 situations in these states, indicators such as active rate, recovery rate, case fatality rate, test positivity rate (TPR), tests per million, cases per million and tests per confirmed case have been used. The study finds that although the absolute number of active cases may be rising, the active rate is showing a decreasing trend, with an increase in recovery rates. With the increasing number of Covid-19 cases, testing has also increased, but not in the same proportion, and thus by developed-nation standards we are lagging. With increasing TPR and cases per million, Delhi is well on its way to surpassing even Mumbai, which till now has proven to be the worst hit in this pandemic. An interesting finding is that of tests per confirmed case, which shows that every 6th person tested in Maharashtra and every 8th in Delhi is testing positive for Covid-19.
Given such an increase and an unlocked India, Delhi might soon enter the third stage of community transmission, where the source of 50 percent or more of cases would be unknown. There has been an increase in Covid-19-related health infrastructure through public-private partnership, in which both private hospitals and labs joined hands to battle Covid-19; however, affordability still remains an issue. If experts are to be believed, the pandemic isn't over because we've unlocked. The worst is yet to come, as Covid-19 is predicted to peak in mid-July to August in India. Thus, it'd be advisable not to venture out unnecessarily just because restrictions have been lifted. Also, following the guidelines (hand-washing, avoiding public gatherings, social distancing, and covering nose and mouth) has now become imperative.
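The indicators named above are simple ratios of cumulative counts. The sketch below uses the usual definitions, which are an assumption here because the abstract does not spell them out; the function name and the input numbers are hypothetical and chosen so that one in six tests is positive.

```python
def covid_indicators(confirmed, recovered, deaths, tests, population):
    """Common surveillance ratios; inputs are cumulative counts for one region.
    Definitions are the conventional ones and may differ from the study's."""
    active = confirmed - recovered - deaths
    return {
        "active_rate_pct": 100.0 * active / confirmed,
        "recovery_rate_pct": 100.0 * recovered / confirmed,
        "case_fatality_rate_pct": 100.0 * deaths / confirmed,
        "test_positivity_rate_pct": 100.0 * confirmed / tests,
        "tests_per_confirmed_case": tests / confirmed,
        "tests_per_million": 1_000_000 * tests / population,
        "cases_per_million": 1_000_000 * confirmed / population,
    }

# Hypothetical numbers, not taken from the study.
stats = covid_indicators(
    confirmed=10_000, recovered=4_500, deaths=350, tests=60_000, population=20_000_000
)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Tests per confirmed case is the reciprocal of the test positivity rate, so a value of 6 corresponds to roughly 16.7% of tests returning positive.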
Gemmar, P.
The pandemic spread of coronavirus leads to an increased burden on healthcare services worldwide. Experience shows that required medical treatment can reach limits at local clinics, and fast and secure clinical assessment of disease severity becomes vital. In [1] a model is presented for predicting the mortality of COVID-19 patients from their biomarkers. Three biomarkers have been selected by ranking with a supervised Multi-tree XGBoost classifier. The prediction model is built up as a binary decision tree with depth three and achieves AUC scores of up to 97.84 ± 0.37 and 95.06 ± 2.21 for the training and external test data sets, respectively. In human assessment and decision making, influencing parameters usually aren't considered as sharp numbers but rather as Fuzzy terms [2], and inferencing primarily yields Fuzzy terms or continuous grades rather than binary decisions. Therefore, I examined a Sugeno-type Fuzzy classifier [3] for disease assessment and decision support. In addition, I used an artificial neural network (SOM, [4]) for selecting the biomarkers. Modelling and validation were done with the identical data base provided by [1]. With the complete training and test data sets, the Fuzzy prediction model achieves improved AUC scores of up to 98.59 or 95.12. The improvements with the Fuzzy classifier become clear as physicians can interpret output grades as belonging more or less strongly to the positive or negative class. An extension of the Fuzzy model, which takes into account the trend in key features over time, provides excellent results with the training data, which, however, could not be finally verified due to the lack of suitable test data. The generation and training of the Fuzzy models was fully automatic and without additional adjustment, with the help of ANFIS from Matlab©.
Shekhar, H.
The article describes modelling efforts for evaluating the current level of COVID-19 infections in India, using an exponential model. Data from 15 March 2020 to 30 April 2020 are used for validating the model, where the intrinsic rise rate is kept constant. It is observed that some states of India, like Maharashtra, Gujarat and Delhi, have much higher daily infection counts. This is modelled by assuming a higher initial number of infections while keeping the rise rate the same. The sudden outbursts are captured using an offset of values for these three states. Data from other states, like Madhya Pradesh, Uttar Pradesh and Rajasthan, are also analysed, and they are found to follow the same constants as India as a whole. Worldwide, many attempts are being made to predict the outburst of COVID-19; in the model described in this paper, the turning point is not predicted, as cases in India are still rising. The developed model is based on daily confirmed infections and not on cumulative infections, and rationalization is carried out for the population of various regions while predicting infections for various states. Assigning a decay constant at this stage would be a premature exercise; keeping that in mind, the exponential model predicts that India will attain 1 lakh cases by 15 May 2020. The figures of 2 lakh and 3 lakh will be attained on 22 May 2020 and 26 May 2020, respectively.
Chire Saire, J. E.; Oblitas Cruz, J. F.
The fast spread of the coronavirus named covid19 caused the current pandemic, forcing changes to daily activities. The Health Councils of each country promoted health policies, closed borders and started partial or total lockdowns. One of the first countries in Europe to be heavily affected was Italy. By the end of April, a neighbouring country, France, was among the 10 countries with the most total cases and had begun its own battle against the coronavirus. This paper studies the impact of the coronavirus on the population of Paris, France from April 23 to June 18, using a Text Mining approach, processing data collected from a social network and using search trends. The first finding is a decreasing pattern of publications/interest, and the second is related to the health crisis and the economic impact generated by the coronavirus.
Martin-Rodriguez, F.; Pajaro-Lorenzo, J.; Isasi-de-Vicente, F.; Fernandez Barciela, M.
This paper is about the application of known machine learning (ML) techniques to the prediction of heart disease risk. A public database is used to train and test the ML models. Results are evaluated using standard measures such as precision, recall and F-score. The ML models selected are well-known techniques based on different approaches. The chosen methods are MLP (Multi-Layer Perceptron), SVM (Support Vector Machine) and Bagged Tree (Bootstrap Aggregated Trees). After evaluating each technique on its own, a new "triple voting method" (TVM) is tested, applying the three individual methods and "adding" their results to improve accuracy.
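One way to read the "triple voting method" is as a majority vote over three binary predictions. The sketch below follows that reading, which is an assumption: the abstract only says the results are "added". The function names, the threshold of two votes, and the sample predictions are hypothetical.

```python
def triple_vote(pred_a, pred_b, pred_c):
    """Combine three binary predictions (0 or 1) per sample by adding them;
    a sample is labeled positive when at least two models agree."""
    if not (len(pred_a) == len(pred_b) == len(pred_c)):
        raise ValueError("all prediction lists must have the same length")
    return [1 if a + b + c >= 2 else 0 for a, b, c in zip(pred_a, pred_b, pred_c)]

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F-score for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up labels and per-model predictions (stand-ins for MLP, SVM and Bagged Tree outputs).
y_true = [1, 0, 1, 1, 0, 0]
mlp = [1, 0, 0, 1, 1, 0]
svm = [1, 0, 1, 0, 0, 0]
tree = [0, 1, 1, 1, 0, 0]

voted = triple_vote(mlp, svm, tree)
print(voted, precision_recall_f1(y_true, voted))
```

In this toy case every individual model makes at least one mistake, but no two models err on the same sample, so the vote recovers all labels; real gains depend on how correlated the models' errors are.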
Rajeshbhai, S. P.; Dhar, S. S.; Shalabh, S.
The pandemic due to the SARS-CoV-2 virus impacted the entire world in different waves. An important question that arises after witnessing the first and second waves of COVID-19 is: will a third wave also arrive and, if so, when? Various methodologies are being used to explore the arrival of the third wave. A statistical methodology based on fitting a mixture of Gaussian distributions is explored in this paper, with the aim of forecasting the third wave using data on the first two waves of the pandemic. Utilizing the data of different countries that are already facing the third wave, this paper attempts to model their daily case data and to predict the impact and timeline of the third wave in India. The Gaussian mixture model, based on a clustering algorithm, is used to estimate the parameters.
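As an illustration of what a mixture-of-Gaussians description of epidemic waves looks like, the sketch below evaluates a two-component curve. It only evaluates the curve for given parameters; the fitting step (the paper's clustering-based estimation) is omitted. All parameter values and function names are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def mixture_daily_cases(day, total_cases, waves):
    """Expected daily cases on `day` when the epidemic curve is a mixture of Gaussians.
    `waves` is a list of (weight, mean_day, std_days); weights should sum to 1."""
    return total_cases * sum(w * gaussian_pdf(day, m, s) for w, m, s in waves)

# Hypothetical two-wave parameters, not estimated from any data.
waves = [(0.3, 180, 40), (0.7, 450, 25)]
total = 30_000_000

curve = [mixture_daily_cases(d, total, waves) for d in range(0, 700)]
peak_day = max(range(len(curve)), key=curve.__getitem__)
print(peak_day, round(curve[peak_day]))
```

Each component contributes a wave with its own timing (mean), duration (standard deviation) and share of total cases (weight), so a forecast of a further wave amounts to estimating one more component.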
Gupta, R.; Pal, S. K.
COVID-19 is spreading really fast around the world. The current study describes the situation of the outbreak of this disease in India and predicts the number of cases expected to arise in India. The study also discusses the regional analysis of Indian states and presents the preparedness level of India in combating this outbreak. The study uses exploratory data analysis to report the current situation and time-series forecasting methods to predict future trends. The data has been taken from the repository of Johns Hopkins University and covers the period from 30th January 2020, when the first case occurred in India, till the end of 24th March 2020, when the Prime Minister of India declared a complete lockdown in the country for 21 days starting 25th March 2020. The major findings show that the number of infected cases in India is rising quickly, with the average infected cases per day rising from 10 to 73 from the first case to the 300th case. The current mortality rate for India stands around 1.9. Kerala and Maharashtra are the top two infected states in India, with more than 100 infected cases reported in each state. A total of 25 states have reported at least one infected case; however, only 8 of them have reported deaths due to COVID-19. The ARIMA model prediction shows that the infected cases in India may reach up to 700 thousand in the next 30 days in the worst-case scenario, while the most optimistic scenario may restrict the numbers to 1000-1200. Also, the average forecast by the ARIMA model for the next 30 days is around 7000 patients, from the current number of 536. Based on the forecasting model using Holt's linear trend, an expected 3 million people may get infected if control measures are not taken in the near future. This study will be useful for key stakeholders like Government Officials and Medical Practitioners in assessing the trends for India and preparing a combat plan with stringent measures.
Also, this study will be helpful for data scientists, statisticians, mathematicians and analytics professionals in predicting outbreak numbers with better accuracy.
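Holt's linear trend method, mentioned above, can be written in a few lines. The sketch below is a generic textbook version, not the study's code: the initialization (first value as level, first difference as trend), the smoothing constants and the sample series are assumptions chosen for illustration.

```python
def holt_linear_forecast(series, alpha, beta, horizon):
    """Holt's linear trend method (double exponential smoothing).
    alpha smooths the level, beta smooths the trend; both lie in (0, 1]."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]               # assumed initialization: first observation
    trend = series[1] - series[0]   # assumed initialization: first difference
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # The h-step-ahead forecast extends the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]

# Made-up cumulative counts that grow by 20 per day.
cases = [100, 120, 140, 160, 180]
print(holt_linear_forecast(cases, alpha=0.8, beta=0.2, horizon=3))
```

Because the forecast is a straight-line extrapolation of the latest trend, long horizons can grow very large when recent growth is steep, which helps explain why trend-based projections of an early outbreak tend to be pessimistic.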
Ahmad, T.; Ashaari, A.; Awang, S. R.; Mamat, S. S.; Wan Mohamad, W. M.; Ahmad Fuad, A. A.; Hassan, N.
The objective of this research is to identify severity zones for the COVID-19 outbreak in Malaysia. The technique employed is a fuzzy graph, which can accommodate the scarcity, quantity, and availability of the data set. Two data sets published by the Ministry of Health of Malaysia are used to implement the technique. The obtained results can offer descriptive insight, reflection, assessment, and strategic actions for combating the pandemic.
Chire Saire, J. E.; Lemus-Martin, R.
During the last months, the pandemic caused by the coronavirus SARS-CoV-2 moved governments, research groups and health organizations to plan, test and execute health policies. At the beginning, the treatment was not clear, but in the following months research groups conducted studies to recommend or ban some medicines. At the same time, the countries with the most cases shifted from Asia to Europe and then to America. Different treatments and medications have been tested and recommended in many countries, such as Hydroxychloroquine, Ivermectin, Azithromycin, Dexamethasone, Prednisone, and Desivir. This paper is a preliminary study of what people are searching for on the Internet, considering ten of the countries with the most cases in the world: Chile, Spain, United Kingdom, Brazil, United States, India, Russia, South Africa, Peru, and Mexico.
Chaurasia, A. R.
Mortality in India remains high by international standards. This paper analyses mortality transition in India during the 70 years since 1950 based on the annual estimates of age-specific probabilities of death prepared by the United Nations Population Division for the period 1950-2021. The analysis reveals that characterisation of mortality transition is sensitive to the summary index of mortality used. Mortality transition in India based on the geometric mean of the age-specific probabilities of death is found to be different from that based on the life expectancy at birth. The transition in mortality based on the geometric mean of age-specific probabilities of death accelerated during 2008-2019 but decelerated when based on the life expectancy at birth. The reason is that mortality transition in younger ages has been faster than mortality transition in older ages. The analysis also reveals that there were around 4.3 excess deaths associated with the COVID-19 epidemic in the country leading to a loss of around 3.7 years in the life expectancy at birth between 2019 and 2021.
Alasousi, L. F.; Alhammouri, S.; Alabdulhadi, S.
Background: Rising fear and panic among the public during the COVID19 pandemic have increased concern regarding anxiety cases in Kuwait. Media captured our attention during this period, and looking for daily virus updates led to more fear. The purpose of this study is to examine the relationship between anxiety and media exposure among Kuwaitis during the COVID19 outbreak. Method: A cross-sectional study among Kuwaiti citizens aged 23-55 years was conducted from April 21, 2020 to May 15, 2020 using an online survey. A total of 1230 participants were included in the study after exclusion criteria were applied. Besides demographic data and media exposure, anxiety was assessed using the generalized anxiety disorder scale (GAD-7); multivariable regression was used to identify the correlation between anxiety and media exposure. Result: The results show that there is a positive correlation between media exposure and anxiety during the COVID19 outbreak in Kuwait (p<.001); furthermore, there is a significant relationship between the frequency of exposure and anxiety (p<.001). Conclusion: This study suggests that during the COVID19 pandemic exposure to media can cause anxiety; therefore, measures should be taken by governments to fight misinformation, and physicians should pay more attention to mental health disease during this period.
Rovetta, A.; Castaldo, L.
Reproducibility and transparency represent some of the main problems of scientific publishing. Currently, the editorial requests of academic journals and peer reviewers can divert the authors' attention from an accurate description of the methods adopted, thus compromising these fundamental scientific aspects. This paves the way for the voluntary falsification of data to obtain striking results. Furthermore, the excessive expansion of introduction and discussion sections increases the likelihood of introducing evaluation bias. Since peer reviewers are generally unpaid for their work, they are not required to reproduce the analysis of the studies they review but only to assess methodological accuracy, reproducibility, and plausibility. Therefore, this paper aims to emphasize that the methods and results sections are the central parts of quantitative analysis. In this regard, we firmly believe that the peer review process should, whenever possible, reproduce the analysis from scratch. Consequently, authors must be required to provide a simple and straightforward tutorial to reproduce the analysis as it was conceived both methodologically and chronologically. Ideas, insights, and discussions among the authors must also be reported. This complete description can be provided as integrative material published with the main manuscript, which is nothing more than a summary of methods and results. Such a procedure would represent the first step to improving the quality of scientific publications, waiting for unscientific concepts such as "publish or perish" to be eradicated from the academic world. In this manuscript, we provide a framework that can serve as a fully reproducible and transparent example of analysis. The aim is to investigate Italian netizens' web interest in paracetamol, ibuprofen, and nimesulide from 2015 to 2022, searching for causal associations with the fever symptom and COVID-19.
The infodemiological tool "Google Trends" was used to collect the data. Correlational analysis showed plausible causal associations between paracetamol, ibuprofen, and fever due to seasonal flu and COVID-19 and, although to a minor extent, COVID-19 vaccine side effects. Paracetamol was historically the most searched substance. However, the trend of ibuprofen caught up with that of paracetamol in 2022. Interest in paracetamol, ibuprofen, and nimesulide increased substantially during the COVID-19 pandemic period. We conclude that web pharmacovigilance via Google Trends can provide relevant evidence for monitoring drug intake in relation to epidemiologically significant events such as epidemics and mass vaccination campaigns.
Panda, M.
The novel Corona-virus (COVID-2019) epidemic has posed a global threat to human life and society. The whole world is working relentlessly to find solutions to fight this deadly virus and reduce the number of deaths. Strategic planning with predictive modelling and short-term forecasting, analyzing the situation based on worldwide available data, allows us to anticipate the future exponential behaviour of the COVID-19 disease. Time series forecasting plays a vital role in developing an efficient forecasting model for predicting the spread of this contagious disease. In this paper, ARIMA (autoregressive integrated moving average) and Holt-Winters time series exponential smoothing are used to develop an efficient 20-day-ahead short-term forecast model to predict the effect of the COVID-19 epidemic. The modelling and forecasting are done with the publicly available dataset from Kaggle, from the perspective of India and five of its states: Odisha, Delhi, Maharashtra, Andhra Pradesh and West Bengal. The model is assessed with the correlogram, ADF test, AIC and RMSE to understand the accuracy of the proposed forecasting model.
Fout, A.; Bayham, J.; Gutilla, M. J.; Fosdick, B. K.; Pidcoke, H.; Kirby, M.; van Leeuwen, P. J.; Anderson, C.
For many institutions of higher learning, the beginning of each semester is marked by a significant migration of young adults into the area. In the midst of the COVID19 pandemic, this presents an opportunity for active cases to be introduced into a community. Prior to the Fall 2020 semester, Colorado State University researchers combined student home locations with recent case counts compiled by the New York Times to assign to each individual a probability of arriving with COVID19. These probabilities were combined to estimate that there would be 7.8 new cases among the on-campus population. Comprehensive testing of arriving students revealed 7 new cases, which validated the approach. The procedure was repeated to explore what could happen if students had returned to campus after Fall break. The estimate of 48 cases corroborated the University's early decision to transition to fully remote learning after break.
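One natural way to combine per-student probabilities into an expected case count is to sum them, by linearity of expectation. The abstract does not state how the probabilities were combined, so the sketch below is an assumption; the cohort sizes, probabilities and function names are hypothetical.

```python
import math

def expected_arrivals(probabilities):
    """Expected number of infected arrivals: by linearity of expectation,
    the sum of each individual's infection probability."""
    return sum(probabilities)

def prob_at_least_one(probabilities):
    """Chance that at least one arrival is infected, assuming independence."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)

# Hypothetical cohort: 1,000 students from a low-prevalence area, 500 from a higher one.
probs = [0.002] * 1000 + [0.01] * 500
print(round(expected_arrivals(probs), 2), round(prob_at_least_one(probs), 4))
```

The expected count needs no independence assumption, whereas the at-least-one probability does; students travelling from the same household or region would violate it.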
Alberto, I.; Alberto, N. R.; Altinel, Y.; Blacker, S.; Binotti, W.; Celi, L. A.; Chua, T.; Fiske, A.; Griffin, M.; Karaca, G.; Mokolo, N.; Naawu, D.; Patscheider, J.; Petushkov, A.; Quion, J. M.; Senteio, C.; Taisbak, S.; Tirnova, I.; Tokashiki, H.; Velasquez, A.; Yaghy, A.; Yap, K.
OBJECTIVE: Artificial intelligence (AI) and machine learning are central components of today's medical environment. The fairness of AI, i.e. the ability of AI to be free from bias, has repeatedly come into question. This study investigates the diversity of the members of academia whose scholarship poses questions about the fairness of AI. METHODS: Articles that combine the topics of fairness, artificial intelligence, and medicine were selected from PubMed, Google Scholar, and Embase using keywords. Eligibility assessment and data extraction from the articles were done manually and cross-checked by another author for accuracy. 375 articles were selected for further analysis, cleaned, and organized in Microsoft Excel; spatial diagrams were generated using Tableau Public. Additional graphs were generated using Matplotlib and Seaborn. The linear and logistic regressions were analyzed using Python. RESULTS: We identified 375 eligible publications, including research and review articles concerning AI and fairness in healthcare. When looking at the demographics of all authors, out of 1984, 794 were female, and 1190 were male. Out of 375 first authors, 155 (41.33%) were female, and 220 (58.67%) were male. For last authors, 110 (31.16%) were female, and 243 (68.84%) were male. In regard to ethnicity, 234 (62.40%) of the first authors were white, 103 (27.47%) were Asian, 24 (6.40%) were black, and 14 (3.73%) were Hispanic. For the last authors, 234 (66.29%) were white, 96 (27.20%) were Asian, 12 (3.40%) were black, and 11 (3.11%) were Hispanic. Most authors were from the USA, Canada, and the United Kingdom. The trend continued for the first and last authors of the articles. When looking at the general distribution, 1631 (82.2%) were based in high-income countries, 209 (10.5%) were based in upper-middle-income countries, 135 (6.8%) were based in lower-middle-income countries, and 9 (0.5%) were based in low-income countries.
CONCLUSIONS: Analysis of the bibliographic data revealed that there is an overrepresentation of white authors and male authors, especially in the roles of first and last author. The more male authors a paper had, the more likely it was to be cited. Additionally, analysis showed that papers whose authors are based in higher-income countries were more likely to be cited more often and published in higher-impact journals. These findings highlight the lack of diversity among the authors in the AI fairness community whose work gains the largest readership, potentially compromising the very impartiality that the AI fairness community is working towards.